Abstract. Balanced minimum evolution (BME) is a statistically consistent
distance-based method to reconstruct a phylogenetic tree from an alignment of
molecular data. In 2000, Pauplin showed that the BME method is equivalent to
optimizing a linear functional over the BME polytope, the convex hull of the
BME vectors obtained from Pauplin's formula applied to all binary trees. The
BME method is related to the Neighbor Joining (NJ) algorithm, now known to be a
greedy optimization of the BME principle. Further, the NJ and BME algorithms
have been studied previously to understand when the NJ Algorithm returns a BME
tree for small numbers of taxa. In this paper we aim to elucidate the structure
of the BME polytope and strengthen knowledge of the connection between the BME
method and NJ Algorithm. We first prove that any subtree-prune-regraft move
from a binary tree to another binary tree corresponds to an edge of the BME
polytope. Moreover, we describe an entire family of faces parametrized by
disjoint clades. We show that these clade-faces are smaller dimensional BME
polytopes themselves. Finally, we show that for any order of joining nodes to
form a tree, there exists an associated distance matrix (i.e., dissimilarity
map) for which the NJ Algorithm returns the BME tree. More strongly, we show
that the BME cone and every NJ cone associated to a tree T have an intersection
of positive measure.

1 Introduction



Current efforts to reconstruct the tree of life for different organisms demand the 
inference of phylogenies from thousands of DNA sequences (see http://tolweb.org/tree/
[T] and m for more details). Large scale projects include the investigation of the 
tree of life for flies, by researchers at North Carolina State University 
(http://www.inhs.illinois.edu/research/FLYTREE/), the tree of life for fungi,
at Duke University (http://aftol.org/), and at the University of Kentucky,
the tree of life for the insect order Hymenoptera (http://www.hymatol.org/). 

The most established approach to tree reconstruction is the maximum likelihood
(ML) method. In this method, evolution is described in terms of a discrete-state
continuous-time Markov process on a phylogenetic tree. Unfortunately, an
exhaustive search for the ML phylogenetic tree is computationally prohibitive for 
large data sets [20]. However, one can efficiently compute a pairwise distance, a 
distance between a pair of leaves, using the ML method. The pairwise distances 
can then be used, together with a distance-based tree reconstruction method,
to recover the phylogenetic tree that relates the sequences [T7j, albeit at a loss 
of accuracy. To date, distance-based methods for phylogeny reconstruction have 
been seen to be the best hope for accurately building phylogenies on very large 
sets of taxa such as the data sets for the tree of life for Hymenoptera [13], [23]. More
precisely, distance-based methods have been shown to be statistically consistent 
in all settings (such as long branch attraction), in contrast with parsimony
methods [H [TOl [HI [H] . Distance-based methods also have a huge speed advan- 
tage over parsimony and likelihood methods, and hence enable the reconstruction 
of trees on greater numbers of taxa. 

In 2002, Desper and Gascuel introduced a balanced minimum evolution 
(BME) principle, based on a branch length estimation scheme of Pauplin [T^ . 
The guiding principle of minimum evolution tree reconstruction methods is to 
return a tree whose total length (sum of branch lengths) is minimal, given an 
input dissimilarity map. The BME method is a special case of these distance- 
based methods wherein branch lengths are estimated by a weighted least-squares 
method (in terms of the input dissimilarity map and the tree in question) that 
puts more emphasis on shorter distances than longer ones. Each labeled tree 
topology gives rise to a vector, called herein the BME vector, which is obtained 
from Pauplin's formula. 

Implementing, exploring, and better understanding the BME method have 
been focal points of several recent works. The software FastME, developed by 
Desper and Gascuel, heuristically optimizes the BME principle using nearest- 
neighbor interchanges (NNI) [T^]. In simulations, FastME gives superior trees 
compared to other distance-based methods, including one of biologists' most 
popular distance-based methods, the Neighbor Joining (NJ) Algorithm, devel- 
oped by Saitou and Nei [21]. In 2000, Pauplin showed that the BME method is 
equivalent to optimizing a linear function, the dissimilarity map, over the BME 
representations of binary trees, given by the BME vectors [19]. Eickmeyer et
al. defined the $n$th BME polytope as the convex hull of the BME vectors for all
binary trees on a fixed number n of taxa. Hence the BME method is equivalent 
to optimizing a linear function, namely, the input dissimilarity map, over a BME 
polytope. In 2010, Matsen and Cueto [S] studied how the BME method works 
when the addition of an extra taxon to a data set alters the structure of the 
optimal phylogenetic tree. They characterized the behavior of the BME phylo- 
genetics on such data sets, using the BME polytopes and the BME cones, i.e., 
the normal cones of the BME polytope. 

Eickmeyer et al. studied the BME polytopes computationally, for unrooted
phylogenetic trees with eight or fewer taxa. In addition to this computational 
study of the BME polytopes, they showed the following general lemma: 

Lemma 1 (Lemma 3.1 in [14]). For any number of taxa $n$, the vertices of the
$n$th BME polytope are exactly the BME vectors of all unrooted binary trees with
n leaves. The BME vector of the star phylogeny lies in the interior of the BME 
polytope, and all other BME vectors lie on the boundary of the BME polytope. 

In particular, Eickmeyer et al. studied edges of the BME polytopes
computationally. They found that the edge graph of the $n$th BME polytope is the
complete graph for $n \le 6$; that is, for any two binary trees $T_1$ and $T_2$
with the same number ($\le 6$) of leaves, there is a dissimilarity map for which
$T_1$ and $T_2$ are (the only) co-optimal BME trees. However, for $n = 7$, the
BME polytope has one combinatorial type of non-edge, i.e., the BME vectors of
two bifurcating trees with seven leaves and three cherries (a cherry being two
leaves adjacent to a common internal node of the tree) fail to be joined by an
edge if and only if their trees are related by two leaf exchanges as depicted in
Figure 1. This completely characterizes the non-edges for $n = 7$.

Fig. 1. The non-edges on the $n$th BME polytope for $n = 7$. Two trees will form
a non-edge if and only if they are trees that have three cherries, and differ by
the pair of leaf exchanges shown in the figure.

Characterizing the edges of the $n$th BME polytope for $n > 7$ remains an
open problem that motivated this work. Understanding the structure of the 
BME polytope through its edges and faces may help with the development of 
new optimization strategies to find an optimal BME tree. For example, one such 
approach could entail employing an edge-walking method over the edges of the 
BME polytope, since the BME method is a linear programming problem over 
the BME polytope. However, until now, not much was known about the faces of 
the BME polytopes besides vertices (which are trivial to characterize). 

This paper makes contributions towards understanding both edges and higher- 
dimensional faces of the BME polytope. First, we prove that any subtree-prune- 
regraft (SPR) move from a binary tree to another binary tree corresponds to an 
edge of the BME polytope. This implies that any NNI move from a binary tree to 
another binary tree corresponds to an edge of the BME polytope. Consequently, 
the method implemented in the software FastME is an edge-walking method over
the edges of the BME polytope using NNI moves. Moreover, we define and describe
an entire family of faces of the BME polytope that are parametrized by
disjoint clades. We show that these clade-faces are smaller dimensional BME
polytopes themselves.

The study of related geometric structures, the BME cones, further clarifies 
the nature of the link between phylogenetic tree reconstruction using the BME 
criterion and using the Neighbor Joining (NJ) Algorithm. In 2006, Gascuel and 
Steel showed that the NJ Algorithm, one of the most popular phylogenetic tree 
reconstruction algorithms, is a greedy algorithm for finding the BME tree as- 
sociated to a dissimilarity map [TS]. The Neighbor Joining Algorithm relies on 
a particular criterion for iteratively selecting cherries; details on cherry-picking 
and the NJ Algorithm are recalled later in the paper. In 2008, based on the 
fact that the selection criterion for cherry-picking is linear in the dissimilarity 
map [5], Eickmeyer et al. showed that the NJ Algorithm will pick cherries to
merge in a particular order and output a particular tree topology $T$ if and only if
the pairwise distances satisfy a system of linear inequalities, whose solution set
forms a polyhedral cone in $\mathbb{R}^{\binom{n}{2}}$ [2]. They defined such a cone as an NJ cone.
In general, the sequence of cherries chosen by the NJ Algorithm is not unique,
hence multiple dissimilarity maps will be assigned by the NJ Algorithm to a
single fixed tree topology $T$. The set of all dissimilarity maps for which the NJ
Algorithm returns a fixed tree topology $T$ is a union of NJ cones; however, this
union is not convex in general. Eickmeyer et al. [14] characterized those
dissimilarity maps for which the NJ Algorithm returns the BME tree, by comparing
the NJ cones with the BME cones, for eight or fewer taxa. 

Yet, before this paper, it was unclear whether, given a tree topology T with an 
arbitrary number of taxa, and any particular order of picking cherries allowed by 
the NJ Algorithm, there existed a dissimilarity map such that the NJ Algorithm 
would return the BME tree T. We prove this in fact is so, despite the fact that 
greedy algorithms do not generally construct the globally optimal structure for 
the condition which they locally optimize. Interpreted in terms of phylogenetics, 
this is particularly important, as it shows that there is no order of picking cherries 
for which the NJ Algorithm will fail to return the BME tree. Geometrically this 
means that for any NJ cone associated with the tree topology T and a particular 
choice of cherry-picking order, there exists a non-empty intersection with the 
BME cone associated with T. Consequently, given any tree topology T, there 
exists a dissimilarity map such that NJ and BME both return the tree topology 
T. More strongly, we show that the BME cone and every NJ cone associated to 
a tree T have an intersection of positive measure. 

This paper is organized as follows: Definitions and notation are covered in
Section 2. Subsection 2.4 treats clade-faces of the BME polytope and contains
a useful proposition concerning objective criteria for greedy linear optimization.
Section 3 contains the proof that two trees adjacent by an SPR move form an
edge of the BME polytope. In Section 4 we present the Cherry Forcing Algorithm
and show that it also provides proof for the existence of clade-faces. Finally,
using the Cherry Forcing Algorithm, in Section 5 we prove that every NJ cone
associated with a tree $T$ has an intersection of positive measure with
the BME cone associated with $T$. That is, given a tree $T$ and a sequence of
cherries chosen by the NJ Algorithm, there is a dissimilarity map such that NJ
and BME return $T$. We finish with a discussion in Section 6.

2 Notation, Definitions and Further Preliminaries 
2.1 Phylogenetic X-trees, Cherries, and Clades 

Let $X$ be a set of leaves, which we also may call taxa; when $|X| = n$, we will
often conveniently identify $X$ with $\{1, 2, \ldots, n\}$. A dissimilarity map (or distance
matrix) is a function $d : X \times X \to \mathbb{R}$ with $d(x, x) = 0$ and $d(x, y) = d(y, x)$ for all
$x, y \in X$. It is convenient to represent a dissimilarity map by a vector $d \in \mathbb{R}^{\binom{n}{2}}$.
In general, we index entries of any $c \in \mathbb{R}^{\binom{n}{2}}$ by pairs $\{i, j\} \subset X$ with $i < j$ in
lexicographic order, i.e., $c = (c_{12}, c_{13}, \ldots, c_{1n}, c_{23}, \ldots, c_{2n}, \ldots, c_{n-1,n}) \in \mathbb{R}^{\binom{n}{2}}$.
We may also index a set of vectors in $\mathbb{R}^{\binom{n}{2}}$ by superscript when necessary, e.g.,
$c^k \in \mathbb{R}^{\binom{n}{2}}$ with $ij$th coordinate $c^k_{ij}$. Define $e_{ij} \in \mathbb{R}^{\binom{n}{2}}$ to be the vector with 1 at
the $ij$th entry and 0 else. Let $\mathbb{R}^{\binom{n}{2}}_{+} = \{x \in \mathbb{R}^{\binom{n}{2}} \mid x_{ij} \ge 0 \text{ for all } 1 \le i < j \le n\}$.
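
The lexicographic indexing of pairs can be made concrete with a short sketch. The function names `pair_index` and `unit_vector` are hypothetical helpers introduced only for illustration, and positions are 0-based, which is a choice of this sketch rather than of the paper.

```python
from itertools import combinations


def pair_index(i: int, j: int, n: int) -> int:
    """0-based position of the pair {i, j} (1 <= i < j <= n) in lexicographic order."""
    if not (1 <= i < j <= n):
        raise ValueError("need 1 <= i < j <= n")
    # Pairs whose first element is 1, ..., i-1 come first: sum_{a=1}^{i-1} (n - a).
    return (i - 1) * n - i * (i - 1) // 2 + (j - i - 1)


def unit_vector(i: int, j: int, n: int) -> list:
    """The vector e_ij: 1 at the ij-th entry and 0 elsewhere."""
    e = [0] * (n * (n - 1) // 2)
    e[pair_index(i, j, n)] = 1
    return e


# For n = 4 the pairs are ordered 12, 13, 14, 23, 24, 34.
order = [pair_index(i, j, 4) for i, j in combinations(range(1, 5), 2)]
```

With this convention, `unit_vector(2, 3, 4)` has its single 1 in the fourth position, matching the ordering $c_{12}, c_{13}, c_{14}, c_{23}, c_{24}, c_{34}$.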

Mathematically, a tree is an undirected graph in which any two vertices are
connected by exactly one simple path; the number of edges incident to any vertex
(i.e., node) $x$ of the tree is the degree $\deg(x)$ of $x$. If the graph consists of more
than a single vertex, a node $x$ with $\deg(x) = 1$ is external, or a leaf; all other
nodes are internal. A phylogenetic $X$-tree is a tree $T$ with set of leaves $X$ and all
internal vertices of degree at least three. Those for which the internal vertices are
all of degree three are here called binary $X$-trees (or just binary trees, when the
context is clear). For $n = |X| \ge 3$ the binary $X$-trees are necessarily unrooted
trees, and for $n \ge 4$, correspond in phylogenetics to unrooted cladograms with
no polytomy. Let $\mathcal{T}_n$ be the set of all binary trees with $n$ leaves; we will assume
throughout $n \ge 3$. Write $E(T)$ for the set of edges (i.e., branches) of $T \in \mathcal{T}_n$.
An edge $e \in E(T)$ is internal (resp., external) if it does not (resp., does) touch
a leaf. A cherry of $T \in \mathcal{T}_n$ is a pair of leaves such that the path between
them consists of just two (necessarily external) edges. An edge-weighting (or
branch length assignment) of $T$ is a function $\omega : E(T) \to \mathbb{R}$ with $\omega(e) \ge 0$
for every $e \in E(T)$. Given an edge weighting $\omega$, define the total tree length
$\omega(T) := \sum_{e \in E(T)} \omega(e)$.

An $X$-split is a partition $A | B$ of $X$ into two subsets (blocks) $A, B \subset X$.
Any edge $e \in E(T)$ of $T \in \mathcal{T}_n$ induces an $X$-split $A_1 | A_2$ by deleting $e$ from
$T$ and letting $A_i$ be the subset of leaves associated to the resulting connected
component $C_i$, $i = 1, 2$, of $T$. Conversely any $X$-split $A_1 | A_2$ corresponds to the
edge that when deleted gives the split. When $e \in E(T)$ is internal, we will call
$C_i$ a clade, and $A_i$ the support $\operatorname{supp}(C_i)$ of $C_i$. When the context is clear, we
may identify a clade $C$ with its support $\operatorname{supp}(C)$. By allowing for the case of
choosing no edge $e$, that is, the trivial $X$-split $\emptyset | X$, we obtain $T$ itself as a clade.
Further, we simply say a clade $C$ is in $\mathcal{T}_n$ (and write $C \in \mathcal{T}_n$) if $C$ is a clade for
some tree in $\mathcal{T}_n$.

For example, leaves $\{1, 2\}$ define a clade of $T_1$ of Figure 3(a), whereas $\{2, 3\}$
is not a clade of $T_1$ since no subgraph whose leaves are exactly $\{2, 3\}$ is attainable by
removing an internal edge of $T_1$. We say two clades $C_1, C_2 \in \mathcal{T}_n$ are disjoint if
$\operatorname{supp}(C_1) \cap \operatorname{supp}(C_2) = \emptyset$. If $C_1, C_2$ are disjoint clades both contained in a tree
$T \in \mathcal{T}_n$, then we define the distance between clades $d_T(C_1, C_2)$ as the number of
edges between clades $C_1$ and $C_2$ in $T$.

Finally, given $T \in \mathcal{T}_n$, let $S(T)$ denote the set of $X$-splits defined by removal
of an edge of $T$. It is well-known [6] that phylogenetic $X$-trees $T_1, T_2 \in \mathcal{T}_n$ are
determined up to equivalence (as graphs) exactly when $S(T_1) = S(T_2)$. If $\omega :
E(T) \to \mathbb{R}$ is an edge-weighting, and $A | B$ is a split in $S(T)$ with corresponding
edge $e \in E(T)$, set $\omega(A | B) := \omega(e)$. A distance matrix $c \in \mathbb{R}^{\binom{n}{2}}$ is called an
additive metric or tree metric if $c$ is a metric, and there exists a tree $T \in \mathcal{T}_n$ and
an edge-weighting $\omega$ on $T$ s.t.

(a) $\omega(e) \ge 0$ for all $e \in E(T)$.

(b) For every pair of leaves $\{i, j\}$, $c_{ij} = \sum_{e} \omega(e)$, summing over edges $e$ along
the path from leaf $i$ to leaf $j$.

Clearly, given $T \in \mathcal{T}_n$ and an edge-weighting $\omega$ on $T$, setting $D_{T,\omega}(i, j) =
\sum_{e} \omega(e)$, for the sum as in (b) above, yields a tree metric $D_{T,\omega} \in \mathbb{R}^{\binom{n}{2}}$. Given a
tree $T \in \mathcal{T}_n$ and any split $A | B$ in $T$, where $A, B \subset \{1, 2, \ldots, n\}$, the split metric
is defined as $D_{A|B} = (d^{A|B}_{ij})$ where $d^{A|B}_{ij} = 1$ if $i \ne j$ and $|\{i, j\} \cap A| = 1$, and
$d^{A|B}_{ij} = 0$ else. Thus each split $A | B$ defines a metric $D_{A|B}$ from $T$ for which all
branch lengths equal zero, except for the branch $e$ corresponding to $A | B$. For
any edge-weighting $\omega$ of $T$, the split metrics for $T$ and the natural tree metric
$D_{T,\omega}$ are related as below (see, e.g., [?]):

$$D_{T,\omega} = \sum_{A|B \in S(T)} \omega(A | B)\, D_{A|B}. \qquad (1)$$
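
As a small numerical illustration of Equation (1), the following sketch builds a tree metric as a weighted sum of split metrics. The quartet tree, its edge weights, and the helper names `split_metric` and `tree_metric` are assumptions chosen for the example; each split is represented by one of its blocks $A$.

```python
from itertools import combinations


def split_metric(A, n):
    """Split metric D_{A|B} as a vector over pairs i < j in lexicographic order."""
    A = set(A)
    # Entry is 1 exactly when the split separates i from j.
    return [1 if len({i, j} & A) == 1 else 0
            for i, j in combinations(range(1, n + 1), 2)]


def tree_metric(weighted_splits, n):
    """Sum of omega(A|B) * D_{A|B} over the splits of a tree, as in Equation (1)."""
    total = [0] * (n * (n - 1) // 2)
    for A, weight in weighted_splits:
        for idx, entry in enumerate(split_metric(A, n)):
            total[idx] += weight * entry
    return total


# Hypothetical quartet tree 12|34: four leaf edges (weights 1, 2, 3, 4)
# and one internal edge (weight 5) inducing the split {1, 2} | {3, 4}.
quartet = [({1}, 1), ({2}, 2), ({3}, 3), ({4}, 4), ({1, 2}, 5)]
D = tree_metric(quartet, 4)  # entries ordered d12, d13, d14, d23, d24, d34
```

Each entry of `D` agrees with the path length between the two leaves; for instance the path from leaf 1 to leaf 3 uses the edges of weight 1, 5, and 3, giving 9.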

2.2 Amalgamation of Cherries 

The amalgamation of $T \in \mathcal{T}_n$ at cherry $\{i, j\}$ is the tree $T'$ on $n - 1$ leaves
obtained by amalgamating the vertices in cherry $\{i, j\}$ to their common internal
parent node. In the more formal mathematical language of relations on the set
of leaves $X = \{x_1, \ldots, x_n\}$, the amalgamation of a cherry $\{x_i, x_j\} \in X \times X$ to
its common internal parent node $v_{ij}$ corresponds to a two-step merge obtained
(without loss of generality) by first merging the nodes $x_i, v_{ij}$ to a new (internal)
node $v'_{ij}$, and then merging $x_j, v'_{ij}$ to a new (external) node $[x_i, x_j]$, resulting
in the new tree $T'$ on the leaf set $X' = X - \{x_i, x_j\} \cup \{[x_i, x_j]\}$.

For example, in Figure 2, amalgamating cherry $\{1, 2\}$ of $T_1$ gives $T_2$ with the
new leaf labeled $[1, 2]$. Next, amalgamating cherry $\{[1, 2], 3\}$ in $T_2$ produces $T_3$.
If $T'$ is obtained from $T$ by successive amalgamations of cherries (including the
possibility that no cherries are amalgamated, so $T' = T$), then any leaf $i'$ of $T'$
is either present in $T$ as a single leaf, or the result of the amalgamation of leaves
$i_1, \ldots, i_t$ of $T$. Hence, leaf $i'$ of $T'$ induces the clade $C$ of $T$ with $\operatorname{supp}(C) = \{i'\}$,
in the first case, or $\operatorname{supp}(C) = \{i_1, \ldots, i_t\}$, in the second case. For example, leaf
$[[[1, 2], 3], 4]$ of $T_4$ in Figure 2 defines a clade in $T_1$ given by the leaves 1, 2, 3, and
4 of $T_1$. We will call the clade of $T$ obtained from any leaf $i'$ of $T'$ the subgraph
of $T$ given by $i'$.

Fig. 2. Cherry $\{1, 2\}$ of tree $T_1$ is amalgamated yielding tree $T_2$, where the
new leaf is labeled $[1, 2]$. Cherry $\{[1, 2], 3\}$ of $T_2$ is amalgamated yielding tree
$T_3$, where the new leaf is labeled $[[1, 2], 3]$. Similarly, the leaves $\{[[1, 2], 3], 4\}$,
$\{[[[1, 2], 3], 4], 5\}$ are amalgamated in trees $T_3$ and $T_4$, respectively.
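
The amalgamation of cherries and the clade given by an amalgamated leaf can be sketched as follows. The adjacency-dictionary representation, the use of nested tuples `(i, j)` in place of the labels $[i, j]$, the internal node names `'u'`, `'v'`, `'w'`, and the five-leaf example tree are all assumptions made for this illustration.

```python
def amalgamate(adj, i, j):
    """Amalgamate cherry {i, j}: return a new adjacency dict in which the
    common parent of leaves i and j becomes a leaf labeled (i, j)."""
    (pi,), (pj,) = adj[i], adj[j]  # each leaf has exactly one neighbor
    if pi != pj:
        raise ValueError("not a cherry")
    new_leaf = (i, j)
    new_adj = {}
    for node, nbrs in adj.items():
        if node in (i, j):
            continue  # the two old leaves disappear
        name = new_leaf if node == pi else node
        new_adj[name] = {new_leaf if m == pi else m
                         for m in nbrs if m not in (i, j)}
    return new_adj


def support(label):
    """Leaves of the original tree merged into a (possibly nested) label."""
    if isinstance(label, tuple):
        return support(label[0]) | support(label[1])
    return {label}


# Hypothetical tree on five leaves: cherries {1, 2} and {4, 5}, leaf 3 in the middle.
T1 = {1: {'u'}, 2: {'u'}, 3: {'v'}, 4: {'w'}, 5: {'w'},
      'u': {1, 2, 'v'}, 'v': {'u', 3, 'w'}, 'w': {'v', 4, 5}}
T2 = amalgamate(T1, 1, 2)       # new leaf (1, 2) attached to 'v'
T3 = amalgamate(T2, (1, 2), 3)  # new leaf ((1, 2), 3) attached to 'w'
```

Here `support(((1, 2), 3))` recovers the support $\{1, 2, 3\}$ of the clade of `T1` given by the amalgamated leaf, and `T3` is the star tree on three leaves.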



2.3 Balanced Minimum Evolution: Method, Vectors, and Polytopes 

For a phylogenetic $X$-tree $T \in \mathcal{T}_n$ and a dissimilarity map $d \in \mathbb{R}^{\binom{n}{2}}_{+}$, there are
different biologically relevant methods to assign branch lengths (i.e., an
edge-weighting) to $T$; in this context, the entry $d_{ij}$ of $d$ is most often regarded as
the distance between any pair of taxa $i$ and $j$. The balanced minimum evolution
(BME) method employs a weighted least squares approach for assigning
branch lengths $l : E(T) \to \mathbb{R}_{+}$ given the dissimilarity map $d$. Defined by
Pauplin [19], the definition of the edge-weights $l(e)$ (i.e., [?, Equations (2), (3),
equiv., (7), (8)]) utilizes average distances between clades, whence
the $l(e)$ are, moreover, linear in the input dissimilarity map $d$ (e.g., see
Equation (1) in [?]). However, for the BME method, the calculation of the total tree
length $l(T) = \sum_{e \in E(T)} l(e)$ is easily stated and quickly computed without
resorting to computing individual branch lengths $l(e)$, by the means we now
describe. For any pair $\{i, j\}$ of leaves of $T$, define

$$y^T_{ij} := \#\{\text{edges between leaves } i \text{ and } j\},$$

the topological distance between $i$ and $j$. Set $w^T_{ij} := 2^{1 - y^T_{ij}}$; then
$w^T := (w^T_{12}, w^T_{13}, \ldots, w^T_{n-1,n}) \in \mathbb{R}^{\binom{n}{2}}$ is a vector depending only on the topology
of $T$. Pauplin's [19] formula for the balanced tree length estimation (or estimated
BME length) $l(T)$ is given by

$$l(T) = \sum_{i,j \,:\, i<j} w^T_{ij}\, d_{ij} = w^T \cdot d. \qquad (2)$$

When necessary for clarity, we will also indicate the dependence on $d$ of the
estimated BME length $w^T \cdot d$ by writing $l(T, d)$. Since $w^T$ depends only on
the topology of $T$, but determines $l(T, d)$ given any input dissimilarity map
$d \in \mathbb{R}^{\binom{n}{2}}_{+}$, we call $w^T$ the BME vector for $T$.
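
The BME vector and Pauplin's formula can be computed directly from the tree topology. The sketch below assumes a binary tree given as an adjacency dictionary with leaves labeled $1, \ldots, n$ and uses the weights $2^{1 - y}$ for a pair at topological distance $y$, which reproduce the entries $1/2$ and $1/4$ of the four-taxon example; the function names and the sample dissimilarity map are illustrative assumptions.

```python
from collections import deque
from fractions import Fraction
from itertools import combinations


def topological_distances(adj, source):
    """Breadth-first search: number of edges from source to every node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def bme_vector(adj, n):
    """BME vector of a binary tree with leaves 1..n, pairs in lexicographic order."""
    w = []
    for i, j in combinations(range(1, n + 1), 2):
        y = topological_distances(adj, i)[j]
        w.append(Fraction(2) ** (1 - y))
    return w


def bme_length(adj, n, d):
    """Pauplin's formula: the dot product of the BME vector with d."""
    return sum(wij * dij for wij, dij in zip(bme_vector(adj, n), d))


# Quartet tree with cherries {1, 2} and {3, 4}; 'u' and 'v' are internal nodes.
quartet = {1: {'u'}, 2: {'u'}, 3: {'v'}, 4: {'v'},
           'u': {1, 2, 'v'}, 'v': {3, 4, 'u'}}
w = bme_vector(quartet, 4)
# Hypothetical input d = (d12, d13, d14, d23, d24, d34).
length = bme_length(quartet, 4, [3, 9, 10, 10, 11, 7])
```

Exact rational arithmetic via `Fraction` avoids rounding when comparing tree lengths; for large inputs one would more likely use floating point.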

In [25], the authors in fact defined terms $w^T_{ij}$ for any phylogenetic $X$-tree
$T$ (not necessarily binary) in terms of certain cyclic permutations ("circular
orderings") of $X$ that respect the structure of $T$ as measured through its set
of splits $S(T)$. In the case of edge-weighted binary $X$-trees, one recovers the
expression for $w^T_{ij}$ in the BME vector and Pauplin's formula. [3S] used this
perspective to establish the consistency of the balanced tree length estimation.
That is, if $T \in \mathcal{T}_n$ has branch lengths $\omega$ and one takes $d = D_{T,\omega}$ in Equation (2),
one obtains $l(T) = \omega(T)$.
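
The consistency statement $l(T) = \omega(T)$ can be checked numerically on a small case. The following sketch uses a hypothetical five-leaf tree with made-up edge weights; it computes each pairwise path length and topological distance by a depth-first walk and then applies Pauplin's formula with weights $2^{1 - y}$.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical weighted tree: cherries {1, 2} and {4, 5}, leaf 3 in the middle.
EDGES = {(1, 'u'): 2, (2, 'u'): 3, ('u', 'v'): 1, (3, 'v'): 4,
         ('v', 'w'): 5, (4, 'w'): 1, (5, 'w'): 2}


def path_info(edges, a, b):
    """Return (number of edges, total weight) of the unique path from a to b."""
    adj = {}
    for (x, y), wt in edges.items():
        adj.setdefault(x, []).append((y, wt))
        adj.setdefault(y, []).append((x, wt))
    stack = [(a, None, 0, 0)]
    while stack:
        node, parent, hops, length = stack.pop()
        if node == b:
            return hops, length
        for nxt, wt in adj[node]:
            if nxt != parent:  # a tree has no cycles, so never walk back
                stack.append((nxt, node, hops + 1, length + wt))
    raise ValueError("nodes are not connected")


def pauplin_length(edges, n):
    """Balanced tree length for the tree metric induced by the edge weights."""
    total = Fraction(0)
    for i, j in combinations(range(1, n + 1), 2):
        hops, dij = path_info(edges, i, j)
        total += Fraction(2) ** (1 - hops) * dij
    return total


estimated = pauplin_length(EDGES, 5)
true_length = sum(EDGES.values())
```

In this example both `estimated` and `true_length` equal 18, as the consistency result predicts for a tree metric.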

The BME method for phylogenetic tree reconstruction (or BME principle)
can be succinctly stated: find a $T \in \mathcal{T}_n$ such that Equation (2) is minimized, given
the dissimilarity map $d \in \mathbb{R}^{\binom{n}{2}}$.

Note that one can efficiently compute the input for the BME method, i.e.,
pairwise distances $d(i, j)$, from any given sequence alignment using the maximum
likelihood estimators (MLEs) under an evolutionary model. The BME method
for tree reconstruction was shown to be consistent in [13].

We recall some necessary definitions from polyhedral geometry [22]. The convex
hull of $\{a_1, \ldots, a_m\} \subset \mathbb{R}^n$ is defined as

$$\left\{ x \in \mathbb{R}^n \;\middle|\; x = \sum_{i=1}^{m} \lambda_i a_i, \ \sum_{i=1}^{m} \lambda_i = 1, \ \lambda_i \ge 0 \right\}.$$

A polytope $\mathcal{P}$ is the convex hull of finitely many points. We say $F \subseteq \mathcal{P}$ is a face
of the polytope $\mathcal{P}$ if there exists a vector $c$ such that $F = \operatorname{argmax}_{x \in \mathcal{P}} c \cdot x$. Every
face $F$ of $\mathcal{P}$ is also a polytope. If the dimension of $\mathcal{P}$ is $d$, a face is a facet
if it is of dimension $d - 1$. A face is an edge if it is of dimension one. Denote
the vertex set of a polytope $\mathcal{P}$ by $\operatorname{vert}(\mathcal{P})$, where a vertex of a $d$-dimensional
polytope is the intersection point of $d$ or more edges, faces or facets.

With the background on BME above in hand, we now recall the definition 
of the central object of study in this paper, the BME polytope, as it arises from 
the BME vectors. 

Definition 1 (BME polytope). The balanced minimum evolution (BME) polytope
$\mathcal{P}_n$ on $n$ leaves is defined as

$$\mathcal{P}_n := \operatorname{conv}\{\, w^T \mid T \in \mathcal{T}_n \,\}.$$

If $F$ is a face of $\mathcal{P}_n$, then its vertex set is given by $\operatorname{vert}(F) = \{w^{T_1}, \ldots, w^{T_m} \mid w^{T_i} \in
F\}$, which we may identify with the set of trees $\{T_1, \ldots, T_m\}$. With this
definition we can see that minimizing Equation (2) is equivalent to minimizing the
linear objective $d \in \mathbb{R}^{\binom{n}{2}}$ over $\mathcal{P}_n$. Using Day's results it can be shown that
choosing a minimizing tree for $d$ from among the $(2n - 5)!!$ unrooted binary
trees is an NP-hard problem [HI [H]. Thus it is NP-hard to optimize linearly over
$\mathcal{P}_n$.




Fig. 3. (a) All $X$-trees on four taxa. (b) The BME polytope $\mathcal{P}_4$ on four taxa.
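
To give a sense of how quickly the search space grows, the count $(2n - 5)!!$ of unrooted binary trees on $n$ labeled leaves can be evaluated with a few lines of code. The function name below is a hypothetical helper introduced for illustration.

```python
def num_unrooted_binary_trees(n: int) -> int:
    """(2n - 5)!! = 1 * 3 * 5 * ... * (2n - 5), for n >= 3."""
    if n < 3:
        raise ValueError("n must be at least 3")
    count = 1
    for odd in range(3, 2 * n - 4, 2):  # odd factors 3, 5, ..., 2n - 5
        count *= odd
    return count


counts = {n: num_unrooted_binary_trees(n) for n in range(3, 9)}
```

For example, there are 3 trees for $n = 4$, 945 for $n = 7$, and 10395 for $n = 8$, which is one reason exhaustive enumeration is only practical for small numbers of taxa.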



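Example 1 ([14]). For $n = 4$, there are the 3 binary trees and the star-shaped
tree as in Figure 3(a). For this case the BME polytope is the convex hull of the
vectors:

$$
\begin{aligned}
w^{T_1} &= \left(\tfrac{1}{2}, \tfrac{1}{4}, \tfrac{1}{4}, \tfrac{1}{4}, \tfrac{1}{4}, \tfrac{1}{2}\right), &
w^{T_2} &= \left(\tfrac{1}{4}, \tfrac{1}{2}, \tfrac{1}{4}, \tfrac{1}{4}, \tfrac{1}{2}, \tfrac{1}{4}\right), \\
w^{T_3} &= \left(\tfrac{1}{4}, \tfrac{1}{4}, \tfrac{1}{2}, \tfrac{1}{2}, \tfrac{1}{4}, \tfrac{1}{4}\right), &
w^{T_4} &= \left(\tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3}\right).
\end{aligned}
$$

Thus the BME polytope $\mathcal{P}_n$ for $n = 4$ is a triangle in $\mathbb{R}^6$. Note that the
star-shaped tree is in the interior of $\mathcal{P}_n$.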
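
One way to see that the star-shaped tree lies inside the triangle is to check that its vector is the average of the three binary-tree vectors. The sketch below does this with exact fractions; the variable names are chosen for illustration only.

```python
from fractions import Fraction as F

# BME vectors of the three binary trees on four taxa (order 12, 13, 14, 23, 24, 34).
w_T1 = [F(1, 2), F(1, 4), F(1, 4), F(1, 4), F(1, 4), F(1, 2)]
w_T2 = [F(1, 4), F(1, 2), F(1, 4), F(1, 4), F(1, 2), F(1, 4)]
w_T3 = [F(1, 4), F(1, 4), F(1, 2), F(1, 2), F(1, 4), F(1, 4)]

# Vector of the star-shaped tree.
w_star = [F(1, 3)] * 6

# Convex combination with equal weights 1/3, 1/3, 1/3.
centroid = [sum(column) / 3 for column in zip(w_T1, w_T2, w_T3)]
```

Since `centroid` equals `w_star` and all three coefficients are strictly positive, the star vector sits in the relative interior of the triangle.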



Remark 1 ([14]). The $n$th BME polytope $\mathcal{P}_n$ lies in $\mathbb{R}^{\binom{n}{2}}$ and has dimension
$\binom{n}{2} - n$.

2.4 Clade Faces

With minor modifications of the proof of BME consistency in [13] we will show
that any collection of disjoint clades defines a face of the BME polytope. This will
also be proved independently and constructively in Section 5 using the Cherry
Forcing Algorithm.

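Lemma 2. Let $C_1, \ldots, C_p \in \mathcal{T}_n$ be a pairwise disjoint collection of clades. There
exists a $c \in \mathbb{R}^{\binom{n}{2}}$ such that

$$\operatorname*{argmax}_{T \in \mathcal{T}_n} w^T \cdot c = \{\, T \in \mathcal{T}_n \mid C_1, \ldots, C_p \in T \,\}.$$

See the appendix for a proof of Lemma 2.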

Lemma 2 proves that every disjoint set of clades corresponds to a face of
$\mathcal{P}_n$, which we define as follows: Given a set of disjoint clades $\{C_1, \ldots, C_p \mid C_i \in
\mathcal{T}_n, \ \forall\, 1 \le i \le p\}$, we define a clade-face of the BME polytope $\mathcal{P}_n$ by $F_{C_1,\ldots,C_p} :=
\{T \in \mathcal{T}_n \mid C_1, \ldots, C_p \in T\}$. Moreover, the face $F_{C_1,\ldots,C_p}$ is the image of an
affine transformation of the BME polytope $\mathcal{P}_l$, where $l := n - \sum_{i=1}^{p} (|C_i| - 1)$.
This follows since every tree in $F_{C_1,\ldots,C_p}$ can be constructed by starting with a
binary tree on $l$ leaves and attaching the clades $\{C_1, \ldots, C_p\}$ to $p$ of the $l$ leaves.

Looking ahead to Section 3, one can see that the three trees corresponding
to a nearest neighbor interchange (explained therein) form a clade-face, as an
immediate consequence of Lemma 2. This suggests that NNI and SPR moves
yield edges of the BME polytope $\mathcal{P}_n$, but this fact will require additional proof.
Leading into this, we provide a proposition which holds for any polytope in
general, and will be key to our further arguments. Roughly speaking, it states
that if the entries of $c_1$ are significantly larger than the entries of $c_2$, then
when linearly optimizing $c_1 + c_2$ over a polytope $\mathcal{P}$, $c_1$ must be maximized
foremost.

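Proposition 1. Let $\mathcal{P} \subset \mathbb{R}^m$ be a polytope and $c_1, c_2 \in \mathbb{R}^m$. If

$$
\min_{\substack{x, y \in \operatorname{vert}(\mathcal{P}) \\ x \in \operatorname{argmax}_{z \in \mathcal{P}} c_1 \cdot z \\ c_1 \cdot x \ne c_1 \cdot y}} \left( c_1 \cdot x - c_1 \cdot y \right)
\; > \;
\max_{x, y \in \operatorname{vert}(\mathcal{P})} \left| c_2 \cdot x - c_2 \cdot y \right| \qquad (3)
$$

then

$$\operatorname*{argmax}_{z \in \operatorname{vert}(\mathcal{P})} (c_1 + c_2) \cdot z \;\subseteq\; \operatorname*{argmax}_{z \in \operatorname{vert}(\mathcal{P})} c_1 \cdot z.$$

See the appendix for a proof of Proposition 1.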
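
The content of Proposition 1 can be illustrated by brute force on a toy polytope. The sketch below checks inequality (3) over an explicit vertex list and compares the two argmax sets; the unit square, the objectives, and the function names are assumptions chosen for the example, and integer data are used so that equality tests are exact.

```python
def dot(c, v):
    return sum(ci * vi for ci, vi in zip(c, v))


def argmax_vertices(vertices, c):
    """Set of vertices maximizing the linear objective c."""
    best = max(dot(c, v) for v in vertices)
    return {v for v in vertices if dot(c, v) == best}


def dominance_holds(vertices, c1, c2):
    """Brute-force check of inequality (3) over the vertex list."""
    top = max(dot(c1, v) for v in vertices)
    # Left side: smallest gap between the optimal c1-value and any other c1-value.
    gaps = [top - dot(c1, y) for y in vertices if dot(c1, y) != top]
    # Right side: largest variation of c2 over the vertices.
    values = [dot(c2, v) for v in vertices]
    spread = max(values) - min(values)
    return not gaps or min(gaps) > spread


square = [(0, 0), (1, 0), (0, 1), (1, 1)]
c1, c2 = (10, 0), (1, 1)
combined = tuple(a + b for a, b in zip(c1, c2))
```

With these choices the gap for `c1` is 10 and the spread of `c2` is 2, so the hypothesis holds and the maximizer of `combined`, the vertex `(1, 1)`, is among the maximizers of `c1`. Replacing the pair by `(1, 0)` and `(-3, 1)` violates inequality (3), and the conclusion then fails as well.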

Consider two clade-faces $F_{C_1,\ldots,C_k}$ and $F_{C'_1,\ldots,C'_{k'}}$, where $|F_{C_1,\ldots,C_k}| > 1$. Then
$F_{C'_1,\ldots,C'_{k'}} \subseteq F_{C_1,\ldots,C_k}$ if and only if for every $1 \le i \le k$, clade $C_i$ is contained in,
or equal to, clade $C'_j$ for some $1 \le j \le k'$. If $F_{C_1,\ldots,C_k} = \{T\}$, then $F_{C'_1,\ldots,C'_{k'}} \subseteq
F_{C_1,\ldots,C_k}$ if $F_{C'_1,\ldots,C'_{k'}} = \{T\}$. We note that this induces a partial order on the
clade-faces of $\mathcal{T}_n$, and gives a lattice if one also considers $\mathcal{P}_n$ and the empty set
as clade-faces.

3 SPR Adjacency Implies BME Adjacency

A subtree-prune-regraft (SPR) move on a tree $T \in \mathcal{T}_n$ is determined by choosing
a clade $C$ of $T$, pruning it from $T$, and amalgamating the two internal edges
originally connecting $C$ to $T$ to one edge. Finally an internal edge of $T$ is chosen,
a node is inserted, and $C$ is attached to this node. For an example, see Figure 4.
Thus, $T$ is changed to another binary tree on $n$ leaves and we say the two trees
are adjacent by an SPR move.

A nearest neighbor interchange (NNI) move on a tree $T \in \mathcal{T}_n$ is determined
by choosing an internal edge $e \in E(T)$, and rearranging the four subgraphs
(clades) that $e$ induces. It is not difficult to see then that an NNI move is also
an SPR move. The following lemma is an application of Proposition 1 applied
to a face of the BME polytope, and two clades contained in the face.

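Lemma 3. Let $F$ be a face of the BME polytope $\mathcal{P}_n$, where $C_1, C_2$ are disjoint
clades with $C_1, C_2 \in T$, $\forall\, T \in \operatorname{vert}(F)$. There exists an objective $c \in \mathbb{R}^{\binom{n}{2}}$ such
that

$$\operatorname*{argmax}_{T \in \mathcal{T}_n} w^T \cdot c = \{\, T \in \operatorname{vert}(F) \mid d_T(C_1, C_2) \ge d_{T'}(C_1, C_2), \ \forall\, T' \in \operatorname{vert}(F) \,\}.$$

Similarly there exists an objective $d \in \mathbb{R}^{\binom{n}{2}}$ such that

$$\operatorname*{argmax}_{T \in \mathcal{T}_n} w^T \cdot d = \{\, T \in \operatorname{vert}(F) \mid d_T(C_1, C_2) \le d_{T'}(C_1, C_2), \ \forall\, T' \in \operatorname{vert}(F) \,\}.$$

For a proof of Lemma 3 see the appendix.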

Corollary 1. Every NNI move corresponds to an edge of the BME polytope. 

Proof. Simply take the objective $c$ given by Lemma 2 which yields the face of
the three trees corresponding to an NNI, then add the extra criterion that
two of the clades are either as close or as far as possible, and apply Lemma 3.

□ 

We now present our result that any pair of two trees adjacent by an SPR 
move yields an edge of the BME polytope. 

Theorem 1. If $T_1, T_2 \in \mathcal{T}_n$ are adjacent by an SPR move, then there exists
$c \in \mathbb{R}^{\binom{n}{2}}$ such that $w^{T_1} \cdot c = w^{T_2} \cdot c > w^{T} \cdot c$ for all $T \in \mathcal{T}_n \setminus \{T_1, T_2\}$.

Proof. Let $T_1, T_2 \in \mathcal{T}_n$ be adjacent by an SPR move. Any such move can be
described by Figure 4, where $C_1, \ldots, C_p$ (labeled $1, \ldots, p$ in Figure 4) are clades
common to $T_1$ and $T_2$ and clade $C_p$ is the subtree that is pruned and regrafted.
By Lemma 2 there exists an objective $c_1 \in \mathbb{R}^{\binom{n}{2}}$ for the clade-face $F_{C_1,\ldots,C_p}$.

Fig. 4. Here $1, \ldots, p$ are subgraphs (clades). Two trees, $T_1$ (a) and $T_2$ (b),
adjacent by an SPR move where subgraph $p$ and its connecting edge is the subgraph
pruned from $T_1$ and regrafted between subgraph $(p - 2)$ and its internal node.



Note that $T_1, T_2 \in F_{C_1,\ldots,C_p}$, but in general $F_{C_1,\ldots,C_p}$ will contain more trees,
hence further restrictions need to be placed on the objective. By repeated use of
Lemma 3 and the objective $c_1$ which defines $F_{C_1,\ldots,C_p}$, there exists an objective
$c_2 \in \mathbb{R}^{\binom{n}{2}}$ such that in this order of importance,

1) the distance between clades $C_{p-1}$ and $C_{p-2}$ is maximized,

2) the distance between clades $C_1$ and $C_2$ is minimized,

3) the distance between clades $C_2$ and $C_3$ is minimized,

4) the distance between clades $C_3$ and $C_4$ is minimized,

...

p-4) the distance between clades $C_{p-4}$ and $C_{p-3}$ is minimized,

for trees in $F_{C_1,\ldots,C_p}$. Since $T_1$ and $T_2$ contain the clades $C_1, \ldots, C_p$ and the
properties in the previous list are satisfied in the prescribed order, we see that
$\{T_1, T_2\} \subseteq \operatorname{argmax}_{T \in \mathcal{T}_n} w^T \cdot c_2$, but the latter may also contain the trees with
the clades in the dashed box of Figure 4 inverted vertically.

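Select leaves $i \in C_{p-1}$ and $j \in C_1$ such that $d_{T_2}(i, j) \le d_{T_2}(m, n)$, $\forall (m, n) \in
C_{p-1} \times C_1$. Similarly, let $k \in C_{p-2}$ and $l \in C_{p-3}$ such that $d_{T_1}(k, l) \le d_{T_1}(m, n)$,
$\forall (m, n) \in C_{p-2} \times C_{p-3}$.

Let $d = w^{T_1}_{kl}\, e_{ij} + w^{T_2}_{ij}\, e_{kl}$. Note that $2 w^{T_1}_{ij} = w^{T_2}_{ij}$ and $w^{T_1}_{kl} = 2 w^{T_2}_{kl}$. It follows
then that $w^{T_1} \cdot d = w^{T_2} \cdot d$. There exists $\epsilon > 0$ small enough such that

$$
\min_{\substack{x, y \in \operatorname{vert}(\mathcal{P}_n) \\ x \in \operatorname{argmax}_{z \in \mathcal{P}_n} c_2 \cdot z \\ c_2 \cdot x \ne c_2 \cdot y}} \left( c_2 \cdot x - c_2 \cdot y \right)
\; > \;
\max_{x, y \in \operatorname{vert}(\mathcal{P}_n)} \left| \epsilon d \cdot x - \epsilon d \cdot y \right|.
$$

Therefore Proposition 1 applies, and the objective $\epsilon d$ optimized over $\operatorname{argmax}_{T \in \mathcal{T}_n} w^T \cdot
c_2$ gives trees such that either clades $C_1$ and $C_{p-1}$ are as close as possible or
$C_{p-2}$ and $C_{p-3}$ are as close as possible. Therefore $\operatorname{argmax}_{T \in \mathcal{T}_n} w^T \cdot (c_2 + \epsilon d)$
contains only $T_1$ and $T_2$.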

□ 






4 Cherry Forcing Objectives 

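The following lemma is a sufficient condition for vectors $c^1, c^2 \in \mathbb{R}^{\binom{n}{2}}_{+}$ to satisfy
Proposition 1 on the BME polytope $\mathcal{P}_n$.

Lemma 4. Let $c^1, c^2 \in \mathbb{R}^{\binom{n}{2}}_{+}$ where for a fixed $K > 0$, $c^1_{ij} = K$ for all $\{i, j\} \in
\operatorname{supp}(c^1)$. If

[...] (4)

then

$$\operatorname*{argmax}_{T \in \mathcal{T}_n} (c^1 + c^2) \cdot w^T \;\subseteq\; \operatorname*{argmax}_{T \in \mathcal{T}_n} c^1 \cdot w^T.$$

See the appendix for a proof of Lemma 4. If a triple $(c^1, c^2, \mathcal{P}_n)$ satisfies the
assumptions and hypothesis of Lemma 4 and Equation (4), then we say it satisfies
the dominance condition.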

Given a clade $C$ in $\mathcal{T}_n$ as input, the idea of the Cherry Forcing Algorithm
is to iteratively fill in entries in an objective $c \in \mathbb{R}^{\binom{n}{2}}$ to satisfy the dominance
condition given in Equation (4) in such a fashion that respects $C$. More precisely,
under the Cherry Forcing Algorithm, (1) a small part of the topology (e.g., a cherry)
of $C$ is fixed, and (2) subsequently filled-in entries in $c$ will be sufficiently small
such that no previously fixed structures of $C$ will be broken when maximizing
$c \cdot w^T$ over $\mathcal{P}_n$. It is proved in Lemma 5 that the sum $c = \sum_i c^i$ of the outputs
of the Cherry Forcing Algorithm will yield a normal vector to the face of the
BME polytope $\mathcal{P}_n$ that consists of all trees that contain $C$. If $C$ is an entire tree,
then the vector $c$ is in the normal cone of the tree $C$ of $\mathcal{P}_n$. That is, $c \cdot w^T$ is
maximal only when $C = T$.

Algorithm 1 (Cherry Forcing Algorithm) 

 1: input $T \in \mathcal{T}_n$, a clade $C$ of $T$. 
 2: output $c^1, c^2, \ldots, c^t \in \mathbb{R}_+^{\binom{n}{2}}$. 
 3: Initialization: Let $T_1 := T$, $K_1 := 1$, $t := 1$, and $c^1 := 0 \in \mathbb{R}^{\binom{n}{2}}$. 
 4: repeat 
 5:   Pick a cherry $\{k, l\}$ of $T_t$ such that the subgraphs of $T$ given by $k$ and $l$ are in clade $C$. 
 6:   Let $G_k$ be the nodes of the subgraph of $T$ given by $k$. 
 7:   Let $G_l$ be the nodes of the subgraph of $T$ given by $l$. 
 8:   for every pair $\{p, q\} \in G_k \times G_l$ do 
 9:     Let $c^t_{p,q} := K_t / \binom{n}{2}$. 
10:   Let $K_{t+1} := \frac{1}{2} \cdot \frac{1}{2^{n-2}} \cdot \frac{K_t}{\binom{n}{2}}$. 
11:   Let $t := t + 1$. 
12:   Let $T_t := T_{t-1}$ where cherry $\{k, l\}$ is amalgamated. 
13: until $T_t$ has a single leaf corresponding to the entire clade $C$ of $T$, or $T_t$ is 
    the star tree on three leaves. 
14: return $c^1, c^2, \ldots, c^t$. 
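The loop of Algorithm 1 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the tree is not represented explicitly; instead the caller supplies the (hypothetical) list `merges` of cherries to amalgamate, each as a pair of leaf sets $(G_k, G_l)$, and is trusted to supply valid cherries. The update of $K$ and the form of the dominance test are assumptions chosen to match the bound used in the proof of Lemma 5, and the helper names `cherry_forcing` and `dominance_holds` are not from the paper.

```python
from math import comb


def cherry_forcing(n, merges):
    """Return the objectives c^1, ..., c^t as dicts {frozenset({p, q}): value}.

    n      -- number of leaves of the tree
    merges -- list of pairs (G_k, G_l) of leaf sets, in the order the
              cherries are amalgamated (assumed to be valid cherries)
    """
    K = 1.0
    outputs = []
    for G_k, G_l in merges:
        # Step 9: every pair {p, q} in G_k x G_l receives the same value.
        c_t = {frozenset((p, q)): K / comb(n, 2) for p in G_k for q in G_l}
        outputs.append(c_t)
        # Step 10 (assumed form): shrink K so later entries stay dominated.
        K = 0.5 * K / (2 ** (n - 2) * comb(n, 2))
    return outputs


def dominance_holds(c_first, c_rest, n):
    """Check K / 2^(n-2) > (1/2) * (sum of all later entries) (assumed form of Eq. 4)."""
    K = next(iter(c_first.values()))
    later = sum(sum(c.values()) for c in c_rest)
    return K / 2 ** (n - 2) > 0.5 * later


# Hypothetical 5-leaf tree with cherries {0, 1} and {3, 4}; after both are
# amalgamated the tree is the star tree on three leaves.
example = cherry_forcing(5, [({0}, {1}), ({3}, {4})])
```

In this example the two supports are disjoint and the first objective dominates the second, which is the behaviour Lemmas 5 and 6 describe for the actual algorithm.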

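Lemma 5. Let $T \in \mathcal{T}_n$ and clade $C$ of $T$ be the input of Algorithm 1 with output 
$c^1, c^2, \ldots, c^t$. Every triple 

$(c^1, c^2 + \cdots + c^t, \mathcal{P}_n)$, 
$(c^2, c^3 + \cdots + c^t, \mathcal{P}_n)$, 
$\ldots$, 
$(c^{t-1}, c^t, \mathcal{P}_n)$ 

satisfies the dominance condition in Equation (4). Consequently 
$\operatorname{argmax}_{T' \in \mathcal{T}_n} \left(\sum_{i=1}^{t} c^i\right) \cdot w^{T'} = \{T' \in \mathcal{T}_n \mid C \text{ is a clade of } T'\}$. 

A proof of Lemma 5 is provided in the appendix. 

Lemma 6. Let $T \in \mathcal{T}_n$ and clade $C$ of $T$ be the input of Algorithm 1 with 
output $c^1, c^2, \ldots, c^t$. Then $\operatorname{supp}(c^i) \cap \operatorname{supp}(c^j) = \emptyset$ for all $1 \le i < j \le t$. 

A proof of Lemma 6 is provided in the appendix. 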



5 Non-empty Intersection of NJ and BME Cones 

The NJ Algorithm, first presented in [21], is a consistent distance-based method 
to reconstruct a phylogenetic tree. Yet its biological interpretation, and the 
criterion it optimizes, have only been established recently. Some initially argued 
that NJ optimized an ordinary least-squares criterion at each step, while others 
contended that it did not optimize any criterion. See [TH] for a short history of 
NJ. However, in [TH] it was shown that NJ in fact greedily minimizes the BME 
criterion at every neighbor joining step. In [14], Eickmeyer et al. characterized 
those dissimilarity maps for which the output of the NJ Algorithm is in fact the 
BME tree, by a comparison of the NJ cones with the BME cones, for eight or 
fewer taxa. 

Given a tree topology $T \in \mathcal{T}_n$ with branch lengths $\omega$, it follows from 
consistency that NJ and BME will return $T$ given the tree metric $D_{T,\omega}$ defined in 
Section 2.1. That is, $D_{T,\omega}$ will lie in at least one NJ cone of $T$. The order in 
which NJ picks cherries depends on the dissimilarity map, and the dissimilarity 
map depends on the branch lengths. Therefore, which NJ cone $D_{T,\omega}$ lies in is 
strictly determined by the branch lengths $\omega$. However, if an NJ cone $C$ of $T$ is fixed, 
it is not clear how branch lengths, call them $\omega'$, can be assigned to the tree topology 
of $T$ such that $D_{T,\omega'}$ is in $C$. Thus, it is not clear that consistency directly 
implies that the BME cone and every NJ cone have non-trivial intersection. 

Our result is that every NJ cone associated to a tree topology $T \in \mathcal{T}_n$ has an 
intersection of positive measure with the BME cone. That is, for any NJ cone 
associated with a particular order of picking cherries and the tree $T$, there is an 
intersection of positive measure with the BME cone associated to $T$, where the 
BME cone is defined as the set of all dissimilarity maps $d \in \mathbb{R}^{\binom{n}{2}}$ such that 

$$\operatorname*{argmin}_{T' \in \mathcal{T}_n} d \cdot w^{T'} \ni T.$$

The NJ Algorithm takes as input a dissimilarity map $c \in \mathbb{R}^{\binom{n}{2}}$ and builds 
a tree $T \in \mathcal{T}_n$ [H]. It involves: 1) picking a cherry $\{i, j\}$, 2) creating a node $a$ 
joining taxa $i$ and $j$, 3) computing the distances from the other nodes to the new 
node $a$, and 4) repeating the procedure until the number of leaves is 3. 

The main problem is picking the cherry. A solution, suggested by Saitou and 
Nei [21] and subsequently modified by Studier and Keppler [26], relies on the 
Q-criterion in Theorem 2 below. 

Theorem 2 (Cherry-picking criterion (Q-criterion) [21, 26]). Let $c \in \mathbb{R}^{\binom{n}{2}}$ 
be an additive tree metric for a tree $T \in \mathcal{T}_n$ and define the $n \times n$ matrix $Q_c$ with 
entries: 

$$Q_c(i, j) = (n - 2)\, c_{i,j} - \sum_{k=1}^{n} c_{i,k} - \sum_{k=1}^{n} c_{k,j} = (n - 4)\, c_{i,j} - \sum_{k \ne j} c_{i,k} - \sum_{k \ne i} c_{k,j}. \qquad (5)$$

Then any pair of leaves $\{i^*, j^*\}$ for which $Q_c(i^*, j^*)$ is minimal is a cherry in 
the tree $T$. 
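As a small illustration of Equation (5), the following Python sketch computes the Q-criterion for a dissimilarity map stored as a symmetric matrix with zero diagonal and returns a minimizing pair. The 5-leaf example metric (unit edge lengths, cherries $\{0, 1\}$ and $\{3, 4\}$) and the function names are hypothetical and chosen only for this illustration.

```python
from itertools import combinations


def q_criterion(c):
    """Q_c(i, j) = (n - 2) c_ij - sum_k c_ik - sum_k c_kj for all pairs i < j."""
    n = len(c)
    row = [sum(r) for r in c]  # row sums; the diagonal is assumed to be zero
    return {(i, j): (n - 2) * c[i][j] - row[i] - row[j]
            for i, j in combinations(range(n), 2)}


def pick_cherry(c):
    """Return one pair of leaves with minimal Q-criterion."""
    q = q_criterion(c)
    return min(q, key=q.get)


# Hypothetical additive tree metric: leaves 0, 1 hang off internal node u,
# leaf 2 off v, leaves 3, 4 off w, with internal path u - v - w and all
# edge lengths equal to 1.
c = [
    [0, 2, 3, 4, 4],
    [2, 0, 3, 4, 4],
    [3, 3, 0, 3, 3],
    [4, 4, 3, 0, 2],
    [4, 4, 3, 2, 0],
]
cherry = pick_cherry(c)
```

For this metric the two cherries tie for the minimal Q-value, so either may be returned; both are cherries of the tree, as Theorem 2 states.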

If the NJ Algorithm selects taxa $\{k, l\}$ as a cherry, and $a$ is the new node 
joining $\{k, l\}$, then the new dissimilarity map $c' \in \mathbb{R}^{\binom{n-1}{2}}$ is defined to be 

$$c'_{i,j} = \begin{cases} \frac{1}{2}\left(c_{i,k} + c_{i,l} - c_{k,l}\right) & \text{if } j = a, \\ c_{i,j} & \text{else.} \end{cases}$$

Lemma 7 (Shifting Lemma [15]). Let $c, x \in \mathbb{R}^{\binom{n}{2}}$ where $x = (1, 1, \ldots, 1)$. 
Then the Neighbor Joining Algorithm applied to $c + kx$, for any $k \in \mathbb{R}$, returns 
the same tree as the Neighbor Joining Algorithm applied to $c$. Moreover, the 
linear ordering of the Q-criteria of $c$ is the same as the linear ordering of the 
Q-criteria of $c + kx$, i.e., if $Q_c(i_1, j_1) \le (<)\, Q_c(i_2, j_2)$ then 
$Q_{c+kx}(i_1, j_1) \le (<)\, Q_{c+kx}(i_2, j_2)$. 
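The reduction step and the Shifting Lemma can be checked numerically on a small example. The sketch below is illustrative only: the matrix representation, the function names, and the 5-leaf example metric are assumptions made for this illustration. It uses the fact that adding $k$ to every off-diagonal entry changes each $Q_c(i, j)$ by the same constant $-nk$, so the ordering of the Q-criteria is unchanged.

```python
from itertools import combinations


def q_criterion(c):
    """Q_c(i, j) = (n - 2) c_ij - sum_k c_ik - sum_k c_kj for all pairs i < j."""
    n = len(c)
    row = [sum(r) for r in c]
    return {(i, j): (n - 2) * c[i][j] - row[i] - row[j]
            for i, j in combinations(range(n), 2)}


def join_cherry(c, k, l):
    """Replace leaves k and l by a new node a (stored last) using the NJ reduction."""
    keep = [i for i in range(len(c)) if i not in (k, l)]
    new = [[c[i][j] for j in keep] + [0.5 * (c[i][k] + c[i][l] - c[k][l])]
           for i in keep]
    new.append([r[-1] for r in new] + [0.0])  # row of distances from a
    return new


def shift(c, k):
    """Return c + k x, i.e. add k to every off-diagonal entry."""
    n = len(c)
    return [[c[i][j] + k if i != j else 0.0 for j in range(n)] for i in range(n)]


def ranking(c):
    """Pairs sorted by Q-value (ties broken by the pair itself)."""
    q = q_criterion(c)
    return sorted(q, key=lambda pair: (q[pair], pair))


# Hypothetical additive metric on 5 leaves with cherries {0, 1} and {3, 4}.
c = [
    [0, 2, 3, 4, 4],
    [2, 0, 3, 4, 4],
    [3, 3, 0, 3, 3],
    [4, 4, 3, 0, 2],
    [4, 4, 3, 2, 0],
]
same_order = ranking(c) == ranking(shift(c, 7.5))
reduced = join_cherry(c, 0, 1)
```

Here `reduced` is the dissimilarity map on the four nodes 2, 3, 4, and the new node $a$, and `same_order` records that the shift did not change the ranking of the pairs.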

Lemma 8. Let $c^1, c^2 \in \mathbb{R}^{\binom{n}{2}}$ where $|\operatorname{supp}(c^1)| = 1$, and $(c^1, c^2, \mathcal{P}_n)$ satisfies the 
dominance condition (Equation (4)). Further, let $c = c^1 + c^2$, $\{\{p, q\}\} = \operatorname{supp}(c^1)$, 
and $Q_{-c}$ be the Q-criteria calculated from the dissimilarity map $-c$. 

If $n > 4$, then $Q_{-c}(p, q) < Q_{-c}(i, j)$ for all $\{i, j\} \ne \{p, q\}$. If $n = 4$, and $p, q, r, s$ 
are the leaves, then $Q_{-c}(p, q) = Q_{-c}(r, s) < Q_{-c}(i, j)$ where $\{i, j\} \ne \{p, q\}$ and 
$\{i, j\} \ne \{r, s\}$. 

See the appendix for a proof of Lemma 8. 
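The statement of Lemma 8 for $n > 4$ can be illustrated on a toy instance. In the sketch below the values of $c^1$ and $c^2$ are hypothetical, and the inequality used to test the dominance condition, $K / 2^{n-2} > \frac{1}{2}\, c^2 \cdot (1, \ldots, 1)$, is an assumed form of Equation (4); the helper names are not from the paper.

```python
from itertools import combinations


def q_criterion(d):
    """Q_d(i, j) = (n - 2) d_ij - sum_k d_ik - sum_k d_kj for all pairs i < j."""
    n = len(d)
    row = [sum(r) for r in d]
    return {(i, j): (n - 2) * d[i][j] - row[i] - row[j]
            for i, j in combinations(range(n), 2)}


def negated_matrix(n, entries):
    """Symmetric matrix of -c built from a dict {(i, j): c_ij}."""
    d = [[0.0] * n for _ in range(n)]
    for (i, j), value in entries.items():
        d[i][j] = d[j][i] = -value
    return d


n = 5
c1 = {(0, 1): 0.1}        # hypothetical c^1 with support {p, q} = {0, 1}
c2 = {(3, 4): 0.000625}   # hypothetical c^2 with much smaller entries

# Assumed form of the dominance condition for the pair (c^1, c^2).
dominated = c1[(0, 1)] / 2 ** (n - 2) > 0.5 * sum(c2.values())

q = q_criterion(negated_matrix(n, {**c1, **c2}))
best = min(q, key=q.get)  # pair with the smallest Q-criterion of -c
```

With these values the pair $\{0, 1\}$ has the strictly smallest Q-criterion of $-c$, so NJ applied to $-c$ would join it first.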

Theorem 3. Let $T \in \mathcal{T}_n$ be the input for Algorithm 1 with output $\{c^1, \ldots, c^t\}$. 
The Neighbor Joining Algorithm with input $-(c^1 + \cdots + c^t)$ returns $T$. 

Proof. We proceed by induction on $n$. If $n = 3$ we have the star tree and there is 
nothing to be done. If $n = 4$ then Lemma 5 applies to the output of Algorithm 1 
and we are done, since NJ will return $T$ by Lemma 8. Consider $n > 4$. Let $T \in \mathcal{T}_n$ 
be the input for Algorithm 1 with output $\{c^1, \ldots, c^t\}$. Define $c := c^1 + \cdots + c^t$. 
Note also that by Lemma 6, $\operatorname{supp}(c^1), \ldots, \operatorname{supp}(c^t)$ are pairwise disjoint. We 
know Lemma 5 applies and implies $\operatorname{argmax}_{T' \in \mathcal{T}_n} w^{T'} \cdot c = \{T\}$. Let 
$\{\{p, q\}\} = \operatorname{supp}(c^1)$ be the first cherry picked in Algorithm 1 and let $Q_{-c}$ be 
the Q-criteria of $-c$. By Lemma 8, $Q_{-c}(p, q)$ is the minimal element in $Q_{-c}$. 
Consider the shifted vector $d := \mathbf{1} - \binom{n}{2} c \in \mathbb{R}^{\binom{n}{2}}$ and note that $d_{p,q} = 0$. 

The Shifting Lemma 7 (together with the fact that scaling by a positive constant 
does not change the ordering of the Q-criteria) implies that $Q_d(p, q)$ will be the 
minimal element in $Q_d$. Thus, the NJ Algorithm will join leaves $p$ and $q$ to the 
new node $a$. Consider the new dissimilarity map $d' \in \mathbb{R}^{\binom{n-1}{2}}$ given by the NJ 
Algorithm. If $i \ne a$ and $j \ne a$ then $d'_{i,j} = d_{i,j}$. For $i \ne a$, 
$d'_{i,a} = \frac{1}{2}(d_{i,p} + d_{i,q} - d_{p,q}) = \frac{1}{2}(d_{i,p} + d_{i,q})$, since $d_{p,q} = 0$. Since the cherry 
$\{p, q\}$ was designated first by Algorithm 1, by construction, for all $i \ne p$ and 
$i \ne q$, $c^l_{i,p} = c^l_{i,q}$ for all $1 \le l \le t$. This implies $d'_{i,a} = d_{i,p} = d_{i,q}$ for $i \ne a$. 
Define $\tilde{c} := \mathbf{1} - d'$ and $\tilde{c}^2, \ldots, \tilde{c}^t \in \mathbb{R}^{\binom{n-1}{2}}$ as follows: 

$$\tilde{c}^l_{i,j} = \begin{cases} \tilde{c}_{i,j} & \text{if } \{i, j\} \in \operatorname{supp}(c^l), \\ 0 & \text{else,} \end{cases}$$

for $2 \le l \le t$. Observe that $|\operatorname{supp}(\tilde{c}^2)| = 1$ since $\tilde{c}^2$ corresponds to the cherry 
picked in the amalgamated tree $T_2$ in Algorithm 1 and $p$ and $q$ have been 
identified with $a$. Moreover, every triple $(\tilde{c}^2, \tilde{c}^3 + \cdots + \tilde{c}^t, \mathcal{P}_{n-1})$, 
$(\tilde{c}^3, \tilde{c}^4 + \cdots + \tilde{c}^t, \mathcal{P}_{n-1})$, $\ldots$, $(\tilde{c}^{t-1}, \tilde{c}^t, \mathcal{P}_{n-1})$ satisfies the dominance 
condition of Equation (4). Thus $\operatorname{argmax}_{T'' \in \mathcal{T}_{n-1}} \tilde{c} \cdot w^{T''} = \{T'\}$ for some 
$T' \in \mathcal{T}_{n-1}$. Since $\tilde{c}$ is a dissimilarity map on $n - 1$ leaves, the induction 
hypothesis holds, and NJ returns $T'$. Finally, $T'$ is contained in $T$ as a clade, 
which implies NJ on $-c$ will return $T$, since $T$ equals the tree $T'$ with leaves 
$p$ and $q$ connected to leaf $a$ by two different edges. 

□ 
Given a fixed tree topology, Algorithm 1 allows for any choice of neighbor 
joining pairs (cherries in the NJ Algorithm), and every such choice yields a 
different NJ cone. Thus, Theorem 3 implies that every NJ cone and the BME cone 
have a non-empty intersection. 

Corollary 2. Every NJ cone $C$ associated to a fixed $T \in \mathcal{T}_n$ has an intersection 
of positive measure with the BME cone associated to $T$. 

Proof. Let $T \in \mathcal{T}_n$ be a tree topology and $C$ be an NJ cone associated to $T$; recall 
that $C$ also depends upon an order of picking cherries. Now apply Algorithm 1 
with $T$ as the input (as both the tree and clade), choosing cherries in step 5 by 
the order associated to the NJ cone $C$, and let $\{c^1, \ldots, c^t\}$ be the output. By 
Theorem 3 the BME and NJ algorithms with input $-\sum_{i=1}^{t} c^i$ each return 
$T$. Moreover, since the cherries were chosen in step 5 to be consistent with $C$, 
we have $-\sum_{i=1}^{t} c^i \in C$. 

Since the BME cone associated to $T$ is convex (as a normal cone of the 
BME polytope), and $\operatorname{argmin}_{T' \in \mathcal{T}_n} \left(-\sum_{i=1}^{t} c^i\right) \cdot w^{T'} = \{T\}$ by Lemma 5, it follows 
that $-\sum_{i=1}^{t} c^i$ lies in the interior of the BME cone associated to $T$. 

On the other hand, individual NJ cones are convex (by definition, or see 
[Tl]) and the boundary of the intersection of multiple NJ cones associated to the 
same tree topology corresponds to two or more cherries having equal Q-scores 
(i.e., Q-criteria entries) at some step in the NJ Algorithm [14]. Lemma 8 implies 
that the first cherry chosen by the NJ Algorithm will be $\operatorname{supp}(c^1)$, that is, it has 
the smallest Q-score with no ties. Moreover, in the proof of Theorem 3 we see 
that the new dissimilarity map derived from $-\sum_{i=1}^{t} c^i$ by the NJ Algorithm also 
satisfies the dominance condition. Hence, Lemma 8 holds again, and there are 
no ties in the Q-score. Therefore, there will be no ties in the Q-score, except for 
the case of four taxa. For four taxa, the only ties present are the trivial ones: if 
$S$ is the set of four taxa ($|S| = 4$) then by definition of the Q-score, 

$$Q(p, q) = Q(r, s) \quad \forall\, \{p, q\} \subset S,\ \{r, s\} = S \setminus \{p, q\}.$$

We note that these trivial ties do not correspond to different NJ cones, and hence 
$-\sum_{i=1}^{t} c^i$ lies in the interior of $C$. 

In conclusion, we see that $-\sum_{i=1}^{t} c^i$ lies in the interiors of both the BME 
cone and the NJ cone $C$. This implies they have an intersection of positive 
measure. 

□ 

6 Discussion 

Mathematically, "closeness" between trees is measured via differing distances 
(metrics) on tree space $\mathcal{T}_n$, including the popular distance measures $d_{NNI}(T, T')$, 
$d_{SPR}(T, T')$, and $d_{TBR}(T, T')$ describing the minimum number of nearest neighbor 
interchange (NNI) (resp., subtree-prune-regrafting (SPR), tree-bisection-regrafting 
(TBR)) moves needed to transform $T$ to $T'$ for $T, T' \in \mathcal{T}_n$. Each 
such metric $M$ yields a notion of adjacency, with $T, T' \in \mathcal{T}_n$ being $M$-adjacent if 
$d_M(T, T') = 1$. The comparison of two trees $T, T'$ as NNI, SPR, or TBR adjacent 
confers useful biological information, including providing the basis for multiple 
tree reconstruction algorithms p [^5H^ . For $T, T' \in \mathcal{T}_n$, set $d_{BME}(T, T') = 1$ 
if $w^T$ and $w^{T'}$ are two vertices joined by an edge in the BME polytope $\mathcal{P}_n$. This 
yields another notion of adjacency in the BME setting. 

The point of view of this paper is that knowledge of BME adjacency, and 
its relationship to NJ adjacency, likewise has the potential to inform our 
understanding of tree space $\mathcal{T}_n$, and the gene and/or species trees its elements 
represent. We have explored some relationships between adjacency for $M$ = 
NNI, SPR, TBR, BME, and NJ. It is well known that an NNI move is a special 
case of an SPR move and an SPR move is a special case of a TBR move. In this 
paper, we have shown that SPR adjacency implies BME adjacency. However, it 
is not known whether TBR adjacency implies BME adjacency. We have made some 
initial explorations in this regard, including using an additional related notion 
of "circular adjacency" predicated upon the circular orderings employed in [25]. 
Having seen no examples to show that TBR adjacency fails to imply 
BME adjacency, we propose the following conjecture. 

Conjecture 1. If $T, T' \in \mathcal{T}_n$, then $d_{TBR}(T, T') = 1$ implies $d_{BME}(T, T') = 1$. 

Considering further the potential applications of such adjacency notions in the 
context of the BME polytopes and BME cones is a topic we hope to explore in 
future work. 
A Appendix 

Proof (Lemma 2). 

The proof of the lemma relies almost entirely on the proof in [13] of the 
consistency of the BME method for phylogenetic tree reconstruction ([13, Theorem 
2 and Appendix 3]). Given $T \in \mathcal{T}_n$ with edge-weighting $\omega$, recall the notation 
of Sections 2.1 and 2.3. Furthermore, from Equations (1) and (2), for any tree 
$T' \in \mathcal{T}_n$ one can obtain the estimated BME length of $T'$ as a linear function of 
the metric $D := D_{T,\omega}$ as 

$$l(T', D) = w^{T'} \cdot D = \sum_{A|B \in S(T)} \omega(A|B)\, l(T', D^{A|B}). \qquad (6)$$

By the consistency of the BME tree length estimation, $l(T, D) = l(T) = 
\omega(T)$. So, for the proof of the consistency of the BME method it sufficed for [13] 
to demonstrate that $l(W, D) > l(T, D)$ for all $W \in \mathcal{T}_n$ with $W \ne T$. By Equation 
(6), it was enough to prove that this inequality holds for any split metric $D^{A|B}$ of $T$ 
in place of $D$. Likewise, for our proof of Lemma 2 we consider the tree $T \in \mathcal{T}_n$ 
such that $C_1, \ldots, C_p \in T$. Furthermore, we take an edge-weighting $\omega$ of $T$ for 
which $\omega(e) = 0$ if $e \notin C_i$ for all $1 \le i \le p$. We will show that if $W \in \mathcal{T}_n$ contains 
$C_1, \ldots, C_p$ then $l(W, D) = l(T, D)$. Otherwise, if $C_i \notin W$ for some $1 \le i \le p$, 
then we show $l(W, D) > l(T, D)$. Both parts proceed by reducing to the case of 
split metrics $D^{A|B}$ for $T$, and drawing upon the results in [13]. 

Consider $W \in \mathcal{T}_n$ such that $C_1, \ldots, C_p \in W$. Since $C_i \in W$ and $C_i \in T$ for all 
$1 \le i \le p$, $S(W) \cap S(T)$ contains any split $A|B$ induced by any edge $e \in C_1, \ldots, C_p$. As 
shown in [13] (by direct calculations using the definition of the BME branch 
lengths), if a split $A|B$ is both in $S(T)$ and $S(W)$ then $l(W, D^{A|B}) = 
l(T, D^{A|B}) = 1$. If $W \ne T$ then there exists some split $A|B$ in $S(T)$ but not in 
$S(W)$, and $\omega(A|B) = 0$. Thus $l(W, D) = l(T, D)$. Now consider $W \in \mathcal{T}_n$ such 
that $C_i \notin W$ for some $1 \le i \le p$. As above, there exists some split $A|B$ in $S(T)$ 
and not in $S(W)$, and moreover $\omega(A|B) > 0$. Under these circumstances, the 
argument in the (remainder of the) proof of Theorem 2 of [13] applies to show 
$l(W, D^{A|B}) > l(T, D^{A|B})$, which suffices to complete the proof of Lemma 2. 

□ 



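Proof (Proposition 1). Let $a, b \in \operatorname{vert}(\mathcal{P})$ and suppose $a \in \operatorname{argmax}_{z \in \operatorname{vert}(\mathcal{P})} z \cdot c_1 \not\ni b$. 
Thus $a \cdot c_1 > b \cdot c_1$. Then 

$$a \cdot (c_1 + c_2) - b \cdot (c_1 + c_2) = a \cdot c_1 - b \cdot c_1 + a \cdot c_2 - b \cdot c_2 
\ge \min_{\substack{x, y \in \operatorname{vert}(\mathcal{P}) \\ x \in \operatorname{argmax}_{z \in \mathcal{P}} c_1 \cdot z \\ c_1 \cdot x \ne c_1 \cdot y}} \left(c_1 \cdot x - c_1 \cdot y\right) + a \cdot c_2 - b \cdot c_2 > 0,$$

since 

$$\min_{\substack{x, y \in \operatorname{vert}(\mathcal{P}) \\ x \in \operatorname{argmax}_{z \in \mathcal{P}} c_1 \cdot z \\ c_1 \cdot x \ne c_1 \cdot y}} \left(c_1 \cdot x - c_1 \cdot y\right) > \max_{x, y \in \operatorname{vert}(\mathcal{P})} \left|c_2 \cdot x - c_2 \cdot y\right|.$$

Thus $b \notin \operatorname{argmax}_{z \in \operatorname{vert}(\mathcal{P})} (c_1 + c_2) \cdot z$ and therefore 

$$\operatorname*{argmax}_{z \in \operatorname{vert}(\mathcal{P})} (c_1 + c_2) \cdot z \subseteq \operatorname*{argmax}_{z \in \operatorname{vert}(\mathcal{P})} c_1 \cdot z.$$

□ 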
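The mechanism of this proof can be tried out on a toy polytope. The following Python sketch is only an illustration: the polytope (the unit square), the objectives, and the function names are hypothetical choices, not taken from the paper. It tests the hypothesis of Proposition 1 (the smallest gap of $c_1$ below its maximum exceeds the largest spread of $c_2$) and then compares the two argmax sets.

```python
from itertools import product


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def argmax_set(vertices, c, tol=1e-12):
    """Vertices maximizing the linear functional c."""
    best = max(dot(v, c) for v in vertices)
    return {v for v in vertices if dot(v, c) >= best - tol}


def hypothesis_holds(vertices, c1, c2):
    """Smallest positive gap of c1 below its maximum vs. largest spread of c2."""
    top = max(dot(v, c1) for v in vertices)
    gaps = [top - dot(y, c1) for y in vertices if dot(y, c1) < top]
    spread = max(abs(dot(x, c2) - dot(y, c2)) for x in vertices for y in vertices)
    # If every vertex maximizes c1 the minimum is over an empty set (vacuous).
    return not gaps or min(gaps) > spread


square = list(product((0, 1), repeat=2))  # vertices of the unit square
c1 = (1.0, 0.0)
c2 = (0.0, 0.5)
combined = tuple(a + b for a, b in zip(c1, c2))

ok = hypothesis_holds(square, c1, c2)
contained = argmax_set(square, combined) <= argmax_set(square, c1)
```

Here $c_1$ is maximized on the edge $x_1 = 1$ and the small perturbation $c_2$ only selects a vertex within that edge, which is the same role the later objectives play in the Cherry Forcing Algorithm.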



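Proof (Lemma 3). Let $d \in \mathbb{R}^{\binom{n}{2}}$ be the normal vector of the face $F$. Let $i$ be a 
leaf of clade $C_1$ and $j$ a leaf of clade $C_2$. Now apply Proposition 1 as follows: 
there exists $\epsilon > 0$ sufficiently small such that 

$$\min_{\substack{x, y \in \operatorname{vert}(\mathcal{P}_n) \\ x \in \operatorname{argmax}_{z \in \mathcal{P}_n} z \cdot d \\ d \cdot x \ne d \cdot y}} \left(d \cdot x - d \cdot y\right) > \max_{x, y \in \operatorname{vert}(\mathcal{P}_n)} \left|\epsilon(-e_{ij}) \cdot x - \epsilon(-e_{ij}) \cdot y\right|.$$

Therefore 

$$\operatorname*{argmax}_{T \in \mathcal{T}_n} w^T \cdot (d - \epsilon e_{ij}) = \operatorname*{argmax}_{T \in \operatorname{vert}(F)} w^T \cdot (-\epsilon e_{ij}),$$

which are precisely all trees contained in $\operatorname{vert}(F)$ such that clades $C_1, C_2$ are 
farthest apart. To show there exists an objective corresponding to a face 
contained in $\operatorname{vert}(F)$ such that clades $C_1, C_2$ are as close as possible, simply change 
$-e_{ij}$ to $e_{ij}$. 

□ 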



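Proof (Lemma 4). Consider the left-hand side of the inequality in Equation (3) 
applied to $c^1$, $c^2$ and $\mathcal{P}_n$: 

$$\min_{\substack{T_1, T_2 \in \mathcal{T}_n \\ T_1 \in \operatorname{argmax}_{T \in \mathcal{T}_n} c^1 \cdot w^T \\ c^1 \cdot w^{T_1} \ne c^1 \cdot w^{T_2}}} \left(c^1 \cdot w^{T_1} - c^1 \cdot w^{T_2}\right) = K \min \sum_{\{i,j\} \in \operatorname{supp}(c^1)} \left(w^{T_1}_{ij} - w^{T_2}_{ij}\right).$$

First note that $\frac{1}{2^{n-2}} \le w^T_{ij} \le \frac{1}{2}$ for all $T \in \mathcal{T}_n$ and all $\{i, j\}$. Thus, 

$$2^{n-2} \sum_{\{i,j\} \in \operatorname{supp}(c^1)} \left(w^{T_1}_{ij} - w^{T_2}_{ij}\right)$$

will be integral and greater than 0. This implies that 

$$\sum_{\{i,j\} \in \operatorname{supp}(c^1)} \left(w^{T_1}_{ij} - w^{T_2}_{ij}\right) \ge \frac{1}{2^{n-2}}.$$

Hence the left-hand side of the inequality in Equation (3) will be greater than or 
equal to $\frac{K}{2^{n-2}}$. Using the bounds on $w^T_{ij}$ for all $T$ and $\{i, j\}$, and the triangle 
inequality, 

$$\frac{1}{2}\, c^2 \cdot (1, 1, \ldots, 1) > \max_{T_1, T_2 \in \mathcal{T}_n} \left|c^2 \cdot w^{T_1} - c^2 \cdot w^{T_2}\right|.$$

Therefore, if Equation (4) holds then the hypothesis of Proposition 1 holds, 
completing the proof of Lemma 4. 

□ 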
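The bounds on $w^T_{ij}$ used in this proof can be checked on a small tree. The sketch below is illustrative and rests on an assumption about the normalization of the BME vector that is not restated in this section: it takes $w^T_{ij} = 2^{1 - e_{ij}}$, where $e_{ij}$ is the number of edges on the path between leaves $i$ and $j$, so that a cherry has weight $\frac{1}{2}$. The adjacency-list encoding and the function name `bme_vector` are likewise hypothetical.

```python
from collections import deque


def bme_vector(adj, leaves):
    """Return {(i, j): 2 ** (1 - number of edges between leaves i and j)}."""
    w = {}
    for i in leaves:
        # Breadth-first search from leaf i to get edge counts to every node.
        dist = {i: 0}
        queue = deque([i])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for j in leaves:
            if i < j:
                w[(i, j)] = 2.0 ** (1 - dist[j])
    return w


# Hypothetical 5-leaf tree: leaves 0, 1 on internal node "u", leaf 2 on "v",
# leaves 3, 4 on "w", with internal path u - v - w.
adj = {
    0: ["u"], 1: ["u"], 2: ["v"], 3: ["w"], 4: ["w"],
    "u": [0, 1, "v"], "v": ["u", 2, "w"], "w": ["v", 3, 4],
}
leaves = [0, 1, 2, 3, 4]
n = len(leaves)
w = bme_vector(adj, leaves)
within_bounds = all(2.0 ** -(n - 2) <= value <= 0.5 for value in w.values())
```

Under this normalization every entry multiplied by $2^{n-2}$ is an integer, which is the integrality property the proof relies on.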



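Proof (Lemma 5). Let $T \in \mathcal{T}_n$ and clade $C$ of $T$ be the input of Algorithm 1 with 
output $c^1, c^2, \ldots, c^t$. Moreover, let $K_1, \ldots, K_t$ be the ordered list of $K_i$ used. 
Let $1 \le r < t$; from Step 10 in Algorithm 1 we see that 

$$K_{r+l} \le \frac{1}{2^l} \cdot \frac{1}{2^{n-2}} \cdot \frac{K_r}{\binom{n}{2}} \quad \text{for } 1 \le l \le t - r.$$

Then 

$$\begin{aligned}
\frac{1}{2}\left(c^{r+1} + c^{r+2} + \cdots + c^t\right) \cdot (1, 1, \ldots, 1)
&= \frac{1}{2}\left(\sum_{\{i,j\} \in \operatorname{supp}(c^{r+1})} c^{r+1}_{ij} + \sum_{\{i,j\} \in \operatorname{supp}(c^{r+2})} c^{r+2}_{ij} + \cdots + \sum_{\{i,j\} \in \operatorname{supp}(c^t)} c^t_{ij}\right) \\
&= \frac{1}{2}\left(\sum_{\{i,j\} \in \operatorname{supp}(c^{r+1})} \frac{K_{r+1}}{\binom{n}{2}} + \sum_{\{i,j\} \in \operatorname{supp}(c^{r+2})} \frac{K_{r+2}}{\binom{n}{2}} + \cdots + \sum_{\{i,j\} \in \operatorname{supp}(c^t)} \frac{K_t}{\binom{n}{2}}\right) \\
&\le \frac{1}{2}\left(K_{r+1} + \cdots + K_t\right) \\
&\le \frac{1}{2}\left(\frac{1}{2} \cdot \frac{1}{2^{n-2}} \cdot \frac{K_r}{\binom{n}{2}} + \frac{1}{2^2} \cdot \frac{1}{2^{n-2}} \cdot \frac{K_r}{\binom{n}{2}} + \cdots + \frac{1}{2^{t-r}} \cdot \frac{1}{2^{n-2}} \cdot \frac{K_r}{\binom{n}{2}}\right) \\
&= \frac{1}{2} \cdot \frac{1}{2^{n-2}} \cdot \frac{K_r}{\binom{n}{2}} \left(\frac{1}{2} + \frac{1}{2^2} + \cdots + \frac{1}{2^{t-r}}\right)
< \frac{K_r}{\binom{n}{2}\, 2^{n-2}} = \frac{c^r_{ij}}{2^{n-2}}
\end{aligned}$$

for $\{i, j\} \in \operatorname{supp}(c^r)$. Thus $(c^r, c^{r+1} + \cdots + c^t, \mathcal{P}_n)$ satisfies the dominance 
condition for $1 \le r < t$ and Lemma 4 applies. Therefore 
$\operatorname{argmax}_{T \in \mathcal{T}_n} (c^r + \cdots + c^t) \cdot w^T \subseteq \operatorname{argmax}_{T \in \mathcal{T}_n} c^r \cdot w^T$. Altogether this implies that 
$\operatorname{argmax}_{T \in \mathcal{T}_n} (c^1 + \cdots + c^t) \cdot w^T$ is the set of trees where, and in this order, 
$c^1 \cdot w^T$ is maximized, $c^2 \cdot w^T$ is maximized, $\ldots$, $c^t \cdot w^T$ is maximized. But this 
recursive linear optimization of $c^1, \ldots, c^t$ over $\mathcal{P}_n$ precisely forces the 
amalgamation of cherries determined in Algorithm 1. 

□ 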

Proof (Lemma 6). 

Suppose not. That is, for some $1 \le i < j \le t$ and some $1 \le k < l \le n$, assume 
$c^i_{k,l} > 0$ and $c^j_{k,l} > 0$. Since $i < j$, this implies leaves $k$ and $l$ are contained in two 
separate leaves of $T_i$ in Algorithm 1. Moreover, since $c^i_{k,l} > 0$, this implies the 
leaves of $T_i$ that contained $k$ and $l$ were amalgamated, giving $T_{i+1}$. Thus, leaves 
$k$ and $l$ will never appear in separate leaves of $T_r$ for any $r > i$. But $c^j_{k,l} > 0$, 
implying $k$ and $l$ appear in separate leaves of $T_j$, a contradiction. 

□ 

Proof (Lemma 8). Let $c^1, c^2 \in \mathbb{R}^{\binom{n}{2}}$ where $|\operatorname{supp}(c^1)| = 1$, and $(c^1, c^2, \mathcal{P}_n)$ 
satisfies the dominance condition (Equation (4)). Further, let $c = c^1 + c^2$, $\{\{p, q\}\} = 
\operatorname{supp}(c^1)$, and $Q_{-c}(i, j)$ be the Q-criteria calculated from the dissimilarity map 
$-c$. If $n = 4$, and $p, q, r, s$ are the leaves, then we see directly that $Q_{-c}(p, q) = 
Q_{-c}(r, s)$. Moreover, $c_{p,q}$ appears in $Q_{-c}(p, r)$, $Q_{-c}(q, s)$, $Q_{-c}(p, s)$, $Q_{-c}(q, r)$, and 
not in $Q_{-c}(p, q)$ and $Q_{-c}(r, s)$. The dominance condition implies 

$$Q_{-c}(p, q) = Q_{-c}(r, s) < \min\left(Q_{-c}(p, r),\, Q_{-c}(q, s),\, Q_{-c}(p, s),\, Q_{-c}(q, r)\right).$$

Now consider $n > 4$. Since $(c^1, c^2, \mathcal{P}_n)$ satisfies the dominance condition, it 
follows that 

$$\frac{c_{p,q}}{2^{n-2}} = \frac{c^1_{p,q}}{2^{n-2}} > \frac{1}{2}\left(\sum_{1 \le i < j \le n} c^2_{i,j}\right) = \frac{1}{2}\left(\sum_{\{i,j\} \ne \{p,q\}} c_{i,j}\right).$$

This implies 

$$Q_{-c}(p, q) = -(n - 4)\, c_{p,q} + \sum_{k \ne q} c_{p,k} + \sum_{k \ne p} c_{k,q} < -(n - 5)\, c_{p,q} - \frac{1}{2} c_{p,q}.$$

Furthermore, $\frac{1}{2} c_{p,q} > c_{i,j}$ for all $\{i, j\} \ne \{p, q\}$, since the dominance condition 
is satisfied. Finally, 

$$Q_{-c}(p, q) < -(n - 5)\, c_{p,q} - \frac{1}{2} c_{p,q} < -(n - 5)\, c_{i,j} - c_{i,j} = -(n - 4)\, c_{i,j} \le Q_{-c}(i, j).$$

□ 